Overview

Brought to you by YData

Dataset statistics

Number of variables: 17
Number of observations: 2121
Missing cells: 1806
Missing cells (%): 5.0%
Duplicate rows: 13
Duplicate rows (%): 0.6%
Total size in memory: 1.9 MiB
Average record size in memory: 929.6 B
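
The percentages in this table follow from the counts. A minimal sketch of that arithmetic, assuming the missing-cell rate is taken over all cells (variables times observations) and the duplicate rate over observations; the variable names are illustrative only:

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 17
n_observations = 2121
missing_cells = 1806
duplicate_rows = 13
avg_record_bytes = 929.6

# Assumption: the missing-cell rate is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Assumption: the duplicate rate is taken over the number of observations.
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

# Average record size times row count should land near the reported total size.
total_mib = round(avg_record_bytes * n_observations / 2**20, 1)

print(total_cells, missing_pct, duplicate_pct, total_mib)
```

Under these assumptions the results agree with the reported 5.0%, 0.6% and 1.9 MiB.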

Variable types

Categorical: 5
Text: 8
Boolean: 4

Alerts

Dataset has 13 (0.6%) duplicate rows [Duplicates]
NObeyesdad is highly overall correlated with family_history_with_overweight [High correlation]
family_history_with_overweight is highly overall correlated with NObeyesdad [High correlation]
CAEC is highly imbalanced (57.8%) [Imbalance]
SMOKE is highly imbalanced (85.4%) [Imbalance]
SCC is highly imbalanced (73.5%) [Imbalance]
MTRANS is highly imbalanced (57.1%) [Imbalance]
Gender has 106 (5.0%) missing values [Missing]
Age has 105 (5.0%) missing values [Missing]
Height has 106 (5.0%) missing values [Missing]
Weight has 105 (5.0%) missing values [Missing]
family_history_with_overweight has 106 (5.0%) missing values [Missing]
FAVC has 107 (5.0%) missing values [Missing]
FCVC has 107 (5.0%) missing values [Missing]
NCP has 106 (5.0%) missing values [Missing]
CAEC has 107 (5.0%) missing values [Missing]
SMOKE has 106 (5.0%) missing values [Missing]
CH2O has 106 (5.0%) missing values [Missing]
SCC has 106 (5.0%) missing values [Missing]
FAF has 107 (5.0%) missing values [Missing]
TUE has 106 (5.0%) missing values [Missing]
CALC has 107 (5.0%) missing values [Missing]
MTRANS has 106 (5.0%) missing values [Missing]
NObeyesdad has 107 (5.0%) missing values [Missing]
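
The imbalance percentages in these alerts are consistent with a normalized-entropy score: one minus the Shannon entropy of the class frequencies divided by the maximum entropy for that number of classes. The sketch below is an assumption about how such a score can be computed, not a statement of the tool's implementation; `imbalance_score` is a hypothetical helper, and the counts are the non-missing value counts listed in the variable summaries.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy in bits, k the class count."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Non-missing value counts taken from the variable summaries in this report.
smoke = imbalance_score([1973, 42])
scc = imbalance_score([1924, 91])
caec = imbalance_score([1682, 230, 51, 51])
mtrans = imbalance_score([1507, 439, 51, 11, 7])

print(f"{smoke:.1%} {scc:.1%} {caec:.1%} {mtrans:.1%}")
```

A perfectly balanced variable scores 0 and a variable dominated by one class approaches 1, which is why SMOKE (1973 vs. 42) scores highest of the four.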

Reproduction

Analysis started: 2025-11-03 00:14:27.802726
Analysis finished: 2025-11-03 00:14:29.126225
Duration: 1.32 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json

Variables

Gender
Categorical

Alerts: Missing

Distinct: 3
Distinct (%): 0.1%
Missing: 106
Missing (%): 5.0%
Memory size: 112.2 KiB

Length

Max length: 6
Median length: 4
Mean length: 4.9890819
Min length: 4

Characters and Unicode

Total characters: 10053
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Male
4th row: Male
5th row: Male

Common Values

Value      Count  Frequency (%)
Male       1008   47.5%
Female     986    46.5%
12345      21     1.0%
(Missing)  106    5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value   Count  Frequency (%)
male    1008   50.0%
female  986    48.9%
12345   21     1.0%

Most occurring characters

Value  Count  Frequency (%)
e      2980   29.6%
l      1994   19.8%
a      1994   19.8%
M      1008   10.0%
F      986    9.8%
m      986    9.8%
1      21     0.2%
2      21     0.2%
3      21     0.2%
4      21     0.2%

Most occurring categories, scripts and blocks

All 10053 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.
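
The two Gender frequency tables give different percentages for the same counts (47.5% and 50.0% for Male). The figures are consistent with the "Common Values" table dividing by all 2121 observations and the "Common Values (Plot)" table dividing by the 2015 non-missing values. The sketch below checks that reading; it is an inference from the numbers, and `frequencies` is a hypothetical helper.

```python
from collections import Counter

# Counts copied from the Gender "Common Values" table; None stands for a missing cell.
counts = Counter({"Male": 1008, "Female": 986, "12345": 21, None: 106})

def frequencies(counts, include_missing):
    """Return {value: percent}, rounded to one decimal place."""
    kept = {v: n for v, n in counts.items() if include_missing or v is not None}
    total = sum(kept.values())
    return {v: round(100 * n / total, 1) for v, n in kept.items()}

with_missing = frequencies(counts, include_missing=True)
without_missing = frequencies(counts, include_missing=False)

print(with_missing["Male"], without_missing["Male"])
```

The value 12345 appears 21 times and is not a plausible gender label; whether it should be treated as missing is a cleaning decision the report itself does not make.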

Age
Text

Alerts: Missing

Distinct: 1326
Distinct (%): 65.8%
Missing: 105
Missing (%): 5.0%
Memory size: 115.4 KiB

Length

Max length: 11
Median length: 10
Mean length: 7.8829365
Min length: 4

Characters and Unicode

Total characters: 15892
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1282
Unique (%): 63.6%

Sample

1st row: 21.0
2nd row: 21.0
3rd row: 23.0
4th row: 27.0
5th row: 22.0

Value                Count  Frequency (%)
18.0                 116    5.8%
26.0                 94     4.7%
21.0                 87     4.3%
23.0                 80     4.0%
19.0                 53     2.6%
20.0                 45     2.2%
22.0                 33     1.6%
17.0                 26     1.3%
unknown_age          22     1.1%
84.0                 20     1.0%
Other values (1316)  1440   71.4%

Most occurring characters

Value              Count  Frequency (%)
.                  3284   20.7%
2                  2082   13.1%
1                  1573   9.9%
0                  1568   9.9%
3                  1217   7.7%
9                  1142   7.2%
8                  1057   6.7%
4                  969    6.1%
5                  927    5.8%
6                  926    5.8%
Other values (10)  1147   7.2%

Most occurring categories, scripts and blocks

All 15892 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.
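
Age is reported as Text rather than numeric, and the value tables show entries such as unknown_age and 19.994.543 that do not parse as plain decimals. A minimal sketch of a strict parser that leaves such entries as missing instead of guessing the intended value; `parse_number` is a hypothetical helper, and how the malformed entries should ultimately be repaired is not something the report states.

```python
def parse_number(text):
    """Return float(text), or None when the text is not a plain decimal number."""
    if text is None:
        return None
    try:
        return float(text)
    except ValueError:
        # Entries like "unknown_age" or "19.994.543" end up here.
        return None

# Example values taken from the Age column as shown in this report.
samples = ["21.0", "unknown_age", "19.994.543", None]
parsed = [parse_number(s) for s in samples]

print(parsed)
```

The same approach would apply to the other numeric-looking columns typed as Text (Height, Weight, FCVC, NCP, CH2O, FAF, TUE).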

Height
Text

Alerts: Missing

Distinct: 1483
Distinct (%): 73.6%
Missing: 106
Missing (%): 5.0%
Memory size: 114.4 KiB

Length

Max length: 9
Median length: 9
Mean length: 7.3811414
Min length: 3

Characters and Unicode

Total characters: 14873
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1423
Unique (%): 70.6%

Sample

1st row: 1.62
2nd row: 1.52
3rd row: 1.8
4th row: 1.8
5th row: 1.78

Value                Count  Frequency (%)
1.7                  59     2.9%
1.65                 47     2.3%
1.75                 36     1.8%
1.6                  36     1.8%
1.62                 35     1.7%
1.8                  26     1.3%
3.96                 21     1.0%
1.72                 19     0.9%
1.63                 17     0.8%
1.67                 16     0.8%
Other values (1473)  1703   84.5%

Most occurring characters

Value  Count  Frequency (%)
.      3318   22.3%
1      2762   18.6%
7      1465   9.9%
6      1425   9.6%
5      1090   7.3%
8      1067   7.2%
3      832    5.6%
9      802    5.4%
2      786    5.3%
4      764    5.1%

Most occurring categories, scripts and blocks

All 14873 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

Weight
Text

Alerts: Missing

Distinct: 1447
Distinct (%): 71.8%
Missing: 105
Missing (%): 5.0%
Memory size: 116.3 KiB

Length

Max length: 11
Median length: 10
Mean length: 8.3263889
Min length: 4

Characters and Unicode

Total characters: 16786
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1381
Unique (%): 68.5%

Sample

1st row: 64.0
2nd row: 56.0
3rd row: 77.0
4th row: 87.0
5th row: 89.8

Value                Count  Frequency (%)
80.0                 52     2.6%
70.0                 40     2.0%
50.0                 39     1.9%
75.0                 37     1.8%
60.0                 34     1.7%
65.0                 24     1.2%
42.0                 22     1.1%
865.0                21     1.0%
90.0                 18     0.9%
85.0                 17     0.8%
Other values (1437)  1712   84.9%

Most occurring characters

Value  Count  Frequency (%)
.      3368   20.1%
1      1914   11.4%
0      1905   11.3%
8      1366   8.1%
5      1336   8.0%
9      1234   7.4%
7      1179   7.0%
2      1173   7.0%
6      1144   6.8%
3      1088   6.5%

Most occurring categories, scripts and blocks

All 16786 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

family_history_with_overweight
Boolean

Alerts: High correlation, Missing

Distinct: 2
Distinct (%): 0.1%
Missing: 106
Missing (%): 5.0%
Memory size: 4.3 KiB

Value      Count  Frequency (%)
True       1649   77.7%
False      366    17.3%
(Missing)  106    5.0%

FAVC
Boolean

Alerts: Missing

Distinct: 2
Distinct (%): 0.1%
Missing: 107
Missing (%): 5.0%
Memory size: 4.3 KiB

Value      Count  Frequency (%)
True       1788   84.3%
False      226    10.7%
(Missing)  107    5.0%

FCVC
Text

Alerts: Missing

Distinct: 765
Distinct (%): 38.0%
Missing: 107
Missing (%): 5.0%
Memory size: 110.2 KiB

Length

Max length: 9
Median length: 3
Mean length: 5.2428004
Min length: 3

Characters and Unicode

Total characters: 10559
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 739
Unique (%): 36.7%

Sample

1st row: 2.0
2nd row: 3.0
3rd row: 2.0
4th row: 3.0
5th row: 2.0

Value               Count  Frequency (%)
3.0                 618    30.7%
2.0                 578    28.7%
1.0                 33     1.6%
2.737.762           2      0.1%
2.823.179           2      0.1%
2.758.394           2      0.1%
2.568.063           2      0.1%
2.630.137           2      0.1%
29.673              2      0.1%
2.971.574           2      0.1%
Other values (755)  771    38.3%

Most occurring characters

Value  Count  Frequency (%)
.      2709   25.7%
2      1673   15.8%
0      1617   15.3%
3      1046   9.9%
1      658    6.2%
9      521    4.9%
4      484    4.6%
6      483    4.6%
8      467    4.4%
7      455    4.3%

Most occurring categories, scripts and blocks

All 10559 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

NCP
Text

Alerts: Missing

Distinct: 599
Distinct (%): 29.7%
Missing: 106
Missing (%): 5.0%
Memory size: 109.2 KiB

Length

Max length: 9
Median length: 3
Mean length: 4.7384615
Min length: 3

Characters and Unicode

Total characters: 9548
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 586
Unique (%): 29.1%

Sample

1st row: 3.0
2nd row: 3.0
3rd row: 3.0
4th row: 3.0
5th row: 1.0

Value               Count  Frequency (%)
3.0                 1155   57.3%
1.0                 186    9.2%
4.0                 68     3.4%
3.559.841           2      0.1%
2.375.026           2      0.1%
1.120.102           2      0.1%
2.644.692           2      0.1%
1.894.384           2      0.1%
3.691.226           2      0.1%
173.762             2      0.1%
Other values (589)  592    29.4%

Most occurring characters

Value  Count  Frequency (%)
.      2558   26.8%
0      1723   18.0%
3      1662   17.4%
1      758    7.9%
2      582    6.1%
4      423    4.4%
9      406    4.3%
8      378    4.0%
7      362    3.8%
6      353    3.7%

Most occurring categories, scripts and blocks

All 9548 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

CAEC
Categorical

Alerts: Imbalance, Missing

Distinct: 4
Distinct (%): 0.2%
Missing: 107
Missing (%): 5.0%
Memory size: 119.8 KiB

Length

Max length: 10
Median length: 9
Mean length: 8.8609732
Min length: 2

Characters and Unicode

Total characters: 17846
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sometimes
2nd row: Sometimes
3rd row: Sometimes
4th row: Sometimes
5th row: Sometimes

Common Values

Value       Count  Frequency (%)
Sometimes   1682   79.3%
Frequently  230    10.8%
Always      51     2.4%
no          51     2.4%
(Missing)   107    5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value       Count  Frequency (%)
sometimes   1682   83.5%
frequently  230    11.4%
always      51     2.5%
no          51     2.5%

Most occurring characters

Value             Count  Frequency (%)
e                 3824   21.4%
m                 3364   18.9%
t                 1912   10.7%
s                 1733   9.7%
o                 1733   9.7%
S                 1682   9.4%
i                 1682   9.4%
n                 281    1.6%
l                 281    1.6%
y                 281    1.6%
Other values (7)  1073   6.0%

Most occurring categories, scripts and blocks

All 17846 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

SMOKE
Boolean

Alerts: Imbalance, Missing

Distinct: 2
Distinct (%): 0.1%
Missing: 106
Missing (%): 5.0%
Memory size: 4.3 KiB

Value      Count  Frequency (%)
False      1973   93.0%
True       42     2.0%
(Missing)  106    5.0%

CH2O
Text

Alerts: Missing

Distinct: 1209
Distinct (%): 60.0%
Missing: 106
Missing (%): 5.0%
Memory size: 112.8 KiB

Length

Max length: 9
Median length: 9
Mean length: 6.569727
Min length: 3

Characters and Unicode

Total characters: 13238
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1178
Unique (%): 58.5%

Sample

1st row: 2.0
2nd row: 3.0
3rd row: 2.0
4th row: 2.0
5th row: 2.0

Value                Count  Frequency (%)
2.0                  422    20.9%
1.0                  203    10.1%
3.0                  155    7.7%
2.825.629            3      0.1%
2.618.198            2      0.1%
1.031.354            2      0.1%
213.755              2      0.1%
2.184.707            2      0.1%
2.406.541            2      0.1%
2.371.015            2      0.1%
Other values (1199)  1220   60.5%

Most occurring characters

Value  Count  Frequency (%)
.      3145   23.8%
2      1792   13.5%
1      1525   11.5%
0      1446   10.9%
3      901    6.8%
7      777    5.9%
6      758    5.7%
5      756    5.7%
9      751    5.7%
4      709    5.4%

Most occurring categories, scripts and blocks

All 13238 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

SCC
Boolean

Alerts: Imbalance, Missing

Distinct: 2
Distinct (%): 0.1%
Missing: 106
Missing (%): 5.0%
Memory size: 4.3 KiB

Value      Count  Frequency (%)
False      1924   90.7%
True       91     4.3%
(Missing)  106    5.0%

FAF
Text

Alerts: Missing

Distinct: 1141
Distinct (%): 56.7%
Missing: 107
Missing (%): 5.0%
Memory size: 111.8 KiB

Length

Max length: 9
Median length: 8
Mean length: 6.0600794
Min length: 3

Characters and Unicode

Total characters: 12205
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1114
Unique (%): 55.3%

Sample

1st row: 0.0
2nd row: 3.0
3rd row: 2.0
4th row: 2.0
5th row: 0.0

Value                Count  Frequency (%)
0.0                  393    19.5%
1.0                  216    10.7%
2.0                  175    8.7%
3.0                  70     3.5%
0.935217             2      0.1%
1.541.072            2      0.1%
0.210351             2      0.1%
1.978.631            2      0.1%
1.399.183            2      0.1%
0.345684             2      0.1%
Other values (1131)  1148   57.0%

Most occurring characters

Value             Count  Frequency (%)
.                 2523   20.7%
0                 2445   20.0%
1                 1379   11.3%
2                 1033   8.5%
9                 755    6.2%
3                 750    6.1%
8                 677    5.5%
6                 673    5.5%
5                 673    5.5%
7                 659    5.4%
Other values (3)  638    5.2%

Most occurring categories, scripts and blocks

All 12205 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

TUE
Text

Alerts: Missing

Distinct: 1071
Distinct (%): 53.2%
Missing: 106
Missing (%): 5.0%
Memory size: 111.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 5.7826303
Min length: 3

Characters and Unicode

Total characters: 11652
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1040
Unique (%): 51.6%

Sample

1st row: 1.0
2nd row: 0.0
3rd row: 1.0
4th row: 0.0
5th row: 0.0

Value                Count  Frequency (%)
0.0                  534    26.5%
1.0                  275    13.6%
2.0                  105    5.2%
0.630866             4      0.2%
159.257              3      0.1%
1.119.877            3      0.1%
0.0026               3      0.1%
1.250.871            2      0.1%
0.469735             2      0.1%
0.371941             2      0.1%
Other values (1061)  1082   53.7%

Most occurring characters

Value             Count  Frequency (%)
0                 2825   24.2%
.                 2268   19.5%
1                 1191   10.2%
2                 727    6.2%
6                 727    6.2%
3                 687    5.9%
8                 676    5.8%
9                 659    5.7%
5                 652    5.6%
7                 633    5.4%
Other values (3)  607    5.2%

Most occurring categories, scripts and blocks

All 11652 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

CALC
Categorical

Alerts: Missing

Distinct: 4
Distinct (%): 0.2%
Missing: 107
Missing (%): 5.0%
Memory size: 115.9 KiB

Length

Max length: 10
Median length: 9
Mean length: 6.9116187
Min length: 2

Characters and Unicode

Total characters: 13920
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: no
2nd row: Sometimes
3rd row: Frequently
4th row: Frequently
5th row: Sometimes

Common Values

Value       Count  Frequency (%)
Sometimes   1336   63.0%
no          610    28.8%
Frequently  67     3.2%
Always      1      < 0.1%
(Missing)   107    5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value       Count  Frequency (%)
sometimes   1336   66.3%
no          610    30.3%
frequently  67     3.3%
always      1      < 0.1%

Most occurring characters

Value             Count  Frequency (%)
e                 2806   20.2%
m                 2672   19.2%
o                 1946   14.0%
t                 1403   10.1%
s                 1337   9.6%
S                 1336   9.6%
i                 1336   9.6%
n                 677    4.9%
l                 68     0.5%
y                 68     0.5%
Other values (7)  271    1.9%

Most occurring categories, scripts and blocks

All 13920 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

MTRANS
Categorical

Alerts: Imbalance, Missing

Distinct: 5
Distinct (%): 0.2%
Missing: 106
Missing (%): 5.0%
Memory size: 138.0 KiB

Length

Max length: 21
Median length: 21
Mean length: 18.124566
Min length: 4

Characters and Unicode

Total characters: 36521
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Public_Transportation
2nd row: Public_Transportation
3rd row: Public_Transportation
4th row: Walking
5th row: Public_Transportation

Common Values

Value                  Count  Frequency (%)
Public_Transportation  1507   71.1%
Automobile             439    20.7%
Walking                51     2.4%
Motorbike              11     0.5%
Bike                   7      0.3%
(Missing)              106    5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value                  Count  Frequency (%)
public_transportation  1507   74.8%
automobile             439    21.8%
walking                51     2.5%
motorbike              11     0.5%
bike                   7      0.3%

Most occurring characters

Value              Count  Frequency (%)
o                  3914   10.7%
i                  3522   9.6%
t                  3464   9.5%
a                  3065   8.4%
n                  3065   8.4%
r                  3025   8.3%
l                  1997   5.5%
b                  1957   5.4%
u                  1946   5.3%
P                  1507   4.1%
Other values (13)  9059   24.8%

Most occurring categories, scripts and blocks

All 36521 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

NObeyesdad
Categorical

Alerts: High correlation, Missing

Distinct: 7
Distinct (%): 0.3%
Missing: 107
Missing (%): 5.0%
Memory size: 134.2 KiB

Length

Max length: 19
Median length: 16
Mean length: 16.191658
Min length: 13

Characters and Unicode

Total characters: 32610
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Normal_Weight
2nd row: Normal_Weight
3rd row: Normal_Weight
4th row: Overweight_Level_I
5th row: Overweight_Level_II

Common Values

Value                Count  Frequency (%)
Obesity_Type_I       342    16.1%
Obesity_Type_III     305    14.4%
Overweight_Level_II  283    13.3%
Obesity_Type_II      279    13.2%
Overweight_Level_I   277    13.1%
Normal_Weight        273    12.9%
Insufficient_Weight  255    12.0%
(Missing)            107    5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value                Count  Frequency (%)
obesity_type_i       342    17.0%
obesity_type_iii     305    15.1%
overweight_level_ii  283    14.1%
obesity_type_ii      279    13.9%
overweight_level_i   277    13.8%
normal_weight        273    13.6%
insufficient_weight  255    12.7%

Most occurring characters

Value              Count  Frequency (%)
e                  4875   14.9%
_                  3500   10.7%
I                  2913   8.9%
i                  2524   7.7%
t                  2269   7.0%
y                  1852   5.7%
O                  1486   4.6%
s                  1181   3.6%
v                  1120   3.4%
g                  1088   3.3%
Other values (17)  9802   30.1%

Most occurring categories, scripts and blocks

All 32610 characters (100.0%) fall into a single category, script and block, each reported as (unknown), so the per-category, per-script and per-block character frequencies are identical to the table above.

Correlations

[Figure: correlation heatmap]

                                CAEC   CALC   FAVC   Gender  MTRANS  NObeyesdad  SCC    SMOKE  family_history_with_overweight
CAEC                            1.000  0.105  0.207  0.084   0.093   0.349       0.167  0.048  0.337
CALC                            0.105  1.000  0.127  0.011   0.099   0.223       0.047  0.106  0.032
FAVC                            0.207  0.127  1.000  0.052   0.177   0.325       0.199  0.033  0.214
Gender                          0.084  0.011  0.052  1.000   0.126   0.393       0.098  0.041  0.116
MTRANS                          0.093  0.099  0.177  0.126   1.000   0.178       0.075  0.003  0.124
NObeyesdad                      0.349  0.223  0.325  0.393   0.178   1.000       0.232  0.102  0.540
SCC                             0.167  0.047  0.199  0.098   0.075   0.232       1.000  0.039  0.190
SMOKE                           0.048  0.106  0.033  0.041   0.003   0.102       0.039  1.000  0.000
family_history_with_overweight  0.337  0.032  0.214  0.116   0.124   0.540       0.190  0.000  1.000
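
The "High correlation" alerts in the Overview can be cross-checked against this matrix. The sketch below scans the upper triangle for pairs at or above a cutoff. The cutoff of 0.5 is an assumption chosen for illustration (the report does not state the threshold it uses), and `high_pairs` is a hypothetical helper.

```python
# Values copied from the correlation matrix in this report.
names = ["CAEC", "CALC", "FAVC", "Gender", "MTRANS", "NObeyesdad",
         "SCC", "SMOKE", "family_history_with_overweight"]
matrix = [
    [1.000, 0.105, 0.207, 0.084, 0.093, 0.349, 0.167, 0.048, 0.337],
    [0.105, 1.000, 0.127, 0.011, 0.099, 0.223, 0.047, 0.106, 0.032],
    [0.207, 0.127, 1.000, 0.052, 0.177, 0.325, 0.199, 0.033, 0.214],
    [0.084, 0.011, 0.052, 1.000, 0.126, 0.393, 0.098, 0.041, 0.116],
    [0.093, 0.099, 0.177, 0.126, 1.000, 0.178, 0.075, 0.003, 0.124],
    [0.349, 0.223, 0.325, 0.393, 0.178, 1.000, 0.232, 0.102, 0.540],
    [0.167, 0.047, 0.199, 0.098, 0.075, 0.232, 1.000, 0.039, 0.190],
    [0.048, 0.106, 0.033, 0.041, 0.003, 0.102, 0.039, 1.000, 0.000],
    [0.337, 0.032, 0.214, 0.116, 0.124, 0.540, 0.190, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumption: illustrative cutoff, not taken from the report

def high_pairs(names, matrix, threshold):
    """Return (name_a, name_b, value) for upper-triangle entries >= threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs

print(high_pairs(names, matrix, THRESHOLD))
```

With this cutoff the only pair returned is NObeyesdad and family_history_with_overweight (0.540), the same pair named in the alerts; the next-largest entry is Gender and NObeyesdad at 0.393.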

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
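
Nullity correlation can be illustrated as the Pearson correlation between the is-missing indicators of two columns. The sketch below uses that definition on hypothetical toy columns; both the definition as the one behind the heatmap and the helper name `nullity_correlation` are assumptions made for illustration.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns; None marks a missing cell.
age = [21.0, None, 23.0, None]
height = [1.62, None, 1.80, None]   # missing in the same rows as age
weight = [None, 64.0, None, 77.0]   # missing exactly where age is present

same = nullity_correlation(age, height)
opposite = nullity_correlation(age, weight)
print(same, opposite)
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present when the other is missing, and a value near 0 means missingness in one says little about the other.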

Sample

index,Gender,Age,Height,Weight,family_history_with_overweight,FAVC,FCVC,NCP,CAEC,SMOKE,CH2O,SCC,FAF,TUE,CALC,MTRANS,NObeyesdad
0,Female,21.0,1.62,64.0,yes,no,2.0,3.0,Sometimes,no,2.0,no,0.0,1.0,no,Public_Transportation,Normal_Weight
1,Female,21.0,1.52,56.0,yes,no,3.0,3.0,Sometimes,yes,3.0,yes,3.0,0.0,Sometimes,Public_Transportation,Normal_Weight
2,Male,23.0,1.8,77.0,yes,NaN,2.0,3.0,Sometimes,no,2.0,no,2.0,1.0,Frequently,Public_Transportation,Normal_Weight
3,Male,27.0,1.8,87.0,no,no,3.0,3.0,Sometimes,no,2.0,no,2.0,0.0,Frequently,Walking,Overweight_Level_I
4,Male,22.0,1.78,89.8,no,no,NaN,1.0,Sometimes,no,2.0,no,0.0,0.0,Sometimes,Public_Transportation,Overweight_Level_II
5,Male,29.0,1.62,53.0,no,NaN,2.0,3.0,Sometimes,no,2.0,no,0.0,0.0,Sometimes,Automobile,Normal_Weight
6,12345,23.0,1.5,55.0,yes,yes,3.0,NaN,Sometimes,no,2.0,no,1.0,0.0,Sometimes,Motorbike,Normal_Weight
7,Male,22.0,1.64,53.0,no,no,2.0,3.0,NaN,no,2.0,no,3.0,0.0,Sometimes,Public_Transportation,Normal_Weight
8,Male,24.0,1.78,64.0,yes,yes,3.0,3.0,Sometimes,no,2.0,no,1.0,1.0,Frequently,Public_Transportation,Normal_Weight
9,Male,22.0,1.72,68.0,yes,yes,2.0,3.0,Sometimes,no,2.0,no,1.0,1.0,no,Public_Transportation,Normal_Weight

index,Gender,Age,Height,Weight,family_history_with_overweight,FAVC,FCVC,NCP,CAEC,SMOKE,CH2O,SCC,FAF,TUE,CALC,MTRANS,NObeyesdad
2111,Female,19.994.543,1.537.739,NaN,no,NaN,1.118.436,3.0,Sometimes,no,1.997.744,no,2.432.443,1.626.194,Sometimes,Public_Transportation,Insufficient_Weight
2112,Female,22.033.129,1.704.223,51.437.985,yes,yes,2.984.425,3.0,Frequently,no,2.044.694,no,2.008.256,1.250.871,no,Public_Transportation,NaN
2113,NaN,NaN,1.690.262,103.180.918,yes,yes,2.649.406,1.120.102,NaN,no,1.153.286,no,0.216908,0.619012,NaN,Public_Transportation,Obesity_Type_II
2114,Male,19.858.973,NaN,69.575.315,no,yes,2.185.938,1.0,no,no,281.353,no,1.0,1.110.222,Sometimes,Public_Transportation,Overweight_Level_I
2115,Female,26.0,1.618.867,NaN,yes,yes,3.0,3.0,Sometimes,no,2.618.198,no,0.0,0.380695,Sometimes,Public_Transportation,Obesity_Type_III
2116,Male,17.039.058,1.799.902,95.419.668,yes,yes,2.0,3.0,Sometimes,no,3.0,no,104.729,0.933802,Sometimes,Public_Transportation,Overweight_Level_II
2117,Female,17.000.433,1.584.951,44.411.801,no,yes,2.737.762,3.0,Sometimes,no,2.310.921,no,2.240.714,159.257,Sometimes,Public_Transportation,Insufficient_Weight
2118,Male,22.882.558,1.793.451,89.909.259,yes,yes,1.899.116,2.375.026,Sometimes,no,139.854,no,NaN,1.365.793,Sometimes,Public_Transportation,Overweight_Level_II
2119,Female,unknown_age,1.639.524,111.945.588,yes,yes,3.0,3.0,Sometimes,no,2.739.351,no,0.0,0.064769,Sometimes,Public_Transportation,Obesity_Type_III
2120,Female,39.170.029,1.688.354,79.278.896,yes,yes,NaN,3.0,Sometimes,no,2.994.515,no,0.0,0.0,no,Automobile,Overweight_Level_II

Duplicate rows

Most frequently occurring

index,Gender,Age,Height,Weight,family_history_with_overweight,FAVC,FCVC,NCP,CAEC,SMOKE,CH2O,SCC,FAF,TUE,CALC,MTRANS,NObeyesdad,# duplicates
9,Male,21.0,1.62,70.0,no,yes,2.0,1.0,no,no,3.0,no,1.0,0.0,Sometimes,Public_Transportation,Overweight_Level_I,7
0,Female,17.000.433,1.584.951,44.411.801,no,yes,2.737.762,3.0,Sometimes,no,2.310.921,no,2.240.714,159.257,Sometimes,Public_Transportation,Insufficient_Weight,2
1,Female,19.994.543,1.537.739,NaN,no,NaN,1.118.436,3.0,Sometimes,no,1.997.744,no,2.432.443,1.626.194,Sometimes,Public_Transportation,Insufficient_Weight,2
2,Female,22.033.129,1.704.223,51.437.985,yes,yes,2.984.425,3.0,Frequently,no,2.044.694,no,2.008.256,1.250.871,no,Public_Transportation,NaN,2
3,Female,25.0,1.57,55.0,no,yes,2.0,1.0,Sometimes,no,2.0,no,2.0,0.0,Sometimes,Public_Transportation,Normal_Weight,2
4,Female,26.0,1.618.867,NaN,yes,yes,3.0,3.0,Sometimes,no,2.618.198,no,0.0,0.380695,Sometimes,Public_Transportation,Obesity_Type_III,2
5,Female,39.170.029,1.688.354,79.278.896,yes,yes,NaN,3.0,Sometimes,no,2.994.515,no,0.0,0.0,no,Automobile,Overweight_Level_II,2
6,Female,unknown_age,1.639.524,111.945.588,yes,yes,3.0,3.0,Sometimes,no,2.739.351,no,0.0,0.064769,Sometimes,Public_Transportation,Obesity_Type_III,2
7,Male,17.039.058,1.799.902,95.419.668,yes,yes,2.0,3.0,Sometimes,no,3.0,no,104.729,0.933802,Sometimes,Public_Transportation,Overweight_Level_II,2
8,Male,19.858.973,NaN,69.575.315,no,yes,2.185.938,1.0,no,no,281.353,no,1.0,1.110.222,Sometimes,Public_Transportation,Overweight_Level_I,2
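
The table lists each repeated row once, with a "# duplicates" count. A minimal sketch of that kind of grouping on hypothetical toy rows; `duplicate_groups` is an illustrative helper, and the sketch makes no claim about how the report's own duplicate totals are derived.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return [(row, count)] for rows that occur more than once, most frequent first."""
    counts = Counter(rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]

# Hypothetical toy rows; tuples are used so that rows are hashable.
rows = [
    ("Male", "21.0", "1.62"),
    ("Female", "25.0", "1.57"),
    ("Male", "21.0", "1.62"),
    ("Male", "21.0", "1.62"),
    ("Female", "25.0", "1.57"),
    ("Male", "22.0", "1.78"),
]

groups = duplicate_groups(rows)
for row, n in groups:
    print(row, n)
```

Rows that appear only once are left out, which matches the layout of the table: one line per repeated row, ordered by how often it occurs.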